Take-home Ex03

Author

Ryan Peh

Published

October 31, 2024

Modified

November 3, 2024

Overview and Objectives

In this take-home my aim is to prototype and evaluate the visualisation of pages and components in our Shiny application.

My responsibilities will be involving the Geographically Weighted Regression (GWR) on the districts of Malaysia. The GWR would focus on modelling crime rates of each district with household income, income inequality and poverty.

Data

Packages

  • olsrr: Provides tools for building and validating OLS regression models with stepwise selection and diagnostics.

  • ggstatsplot: Extends ggplot2 with statistical tests and data visualization in a single, user-friendly syntax.

  • sf: Provides a standardized way to work with spatial vector data (points, lines, polygons).

  • tmap: Creates thematic maps with interactive and static modes for spatial visualization.

  • tidyverse: A collection of packages for easy data manipulation, visualization, and analysis.

  • performance: Offers tools to assess, validate, and compare regression models.

  • see: Supplies visual themes and color palettes to enhance ggplot2 visualizations.

  • sfdep: Provides spatial dependency analysis tools specifically for sf objects.

  • GWmodel: Implements geographically weighted regression (GWR) and spatial analysis methods.

  • lubridate: Simplifies the handling and manipulation of dates and times.

pacman::p_load(olsrr, ggstatsplot, sf, tmap, tidyverse, performance, see, sfdep, GWmodel, lubridate)

UI Design

Motivation

Using Geographically Weighted Regression (GWR) to explore the relationship between crime rates and socioeconomic factors like household income, income inequality, and poverty allows us to understand how these factors influence crime spatially rather than uniformly across all regions. This approach is motivated by the premise that socio-economic conditions impact crime differently depending on local contexts—certain areas with high poverty may experience more crime due to limited economic opportunities, while other regions might be more influenced by income inequality. By applying GWR, we can identify these spatial variations and better understand where and how each factor contributes to crime. This spatial insight not only refines our understanding of the socio-economic drivers of crime but also supports the development of targeted interventions, enabling policymakers to address crime in a more focused and locally relevant manner.

A Shiny app could visualize the spatial relationship between crime rates and socioeconomic factors like household income, income inequality, and poverty by displaying interactive, region-specific maps. Users could toggle between map layers to see how each factor correlates with crime across areas, revealing patterns that vary locally. Adjustable settings, such as bandwidth, would allow users to explore the data at different spatial scales, providing insights into whether socioeconomic conditions affect crime more strongly in specific regions. Real-time plots and statistics that update based on user inputs would further enhance understanding, making complex spatial relationships clear and actionable for policymakers and communities aiming to develop targeted crime prevention strategies.

Overall Layout

For the shiny application, three main components will be used for our pages: headerPanel, sidebarPanel, and mainPanel.

  • Header Panel : This is the topmost part of the UI where we can put a description of the application or have a navbar where you can navigate different pages. Each page leads to other group members work/part in this project

  • Sidebar Panel: This panel would mainly consist of the input controls that the user can play around with to change the map output in the Main Panel.

  • Main Panel : This is the primary area of the application and it typically contains outputs. The main panel displays the output (like maps, plots, tables, etc.) based on the input given in the sidebar panel. There would be tabsets that allows users to switch between views.

The various details of the UI will be explored as we try to come up with the data inputs below.

Data

Aspatial

read_csv() of sf package will be used to import the aspatial data into R environment.

crime <- read_csv("data/aspatial/crime_district.csv")
Rows: 19152 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (4): state, district, category, type
dbl  (1): crimes
date (1): date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
inequality <- read_csv("data/aspatial/inequality_district.csv")
Rows: 318 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): state, district
dbl  (1): gini
date (1): date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
income <- read_csv("data/aspatial/income_district.csv")
Rows: 318 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): state, district
dbl  (2): income_mean, income_median
date (1): date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
poverty <- read_csv("data/aspatial/poverty_district.csv")
Rows: 318 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): state, district
dbl  (2): poverty_absolute, poverty_relative
date (1): date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Geospatial

st_read() of sf package will be used to import Malaysia shapefile into R environment in order to get the polygons representing the borders of the districts of Malaysia.

msia = st_read(dsn = "data/geospatial", layer = "mys_admbnda_adm2_unhcr_20210211")
Reading layer `mys_admbnda_adm2_unhcr_20210211' from data source 
  `C:\ryanpxp\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 144 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 99.64072 ymin: 0.855001 xmax: 119.2697 ymax: 7.380556
Geodetic CRS:  WGS 84

Data Wrangling

Before the UI prototyping can be done let’s see what type of data we are dealing with so that we can better plan for the UI components to be used.

Checking the CRS of the files:

st_crs(msia)
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]

It seems to be using WGS 84, lets change that to EPSG:3168 instead.

msia <- msia %>%
  st_transform(crs = 3168)
st_crs(msia)
Coordinate Reference System:
  User input: EPSG:3168 
  wkt:
PROJCRS["Kertau (RSO) / RSO Malaya (m)",
    BASEGEOGCRS["Kertau (RSO)",
        DATUM["Kertau (RSO)",
            ELLIPSOID["Everest 1830 (RSO 1969)",6377295.664,300.8017,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4751]],
    CONVERSION["Rectified Skew Orthomorphic Malaya Grid (metre)",
        METHOD["Hotine Oblique Mercator (variant A)",
            ID["EPSG",9812]],
        PARAMETER["Latitude of projection centre",4,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8811]],
        PARAMETER["Longitude of projection centre",102.25,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8812]],
        PARAMETER["Azimuth of initial line",323.0257905,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8813]],
        PARAMETER["Angle from Rectified to Skew Grid",323.130102361111,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8814]],
        PARAMETER["Scale factor on initial line",0.99984,
            SCALEUNIT["unity",1],
            ID["EPSG",8815]],
        PARAMETER["False easting",804670.24,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["Malaysia - West Malaysia onshore."],
        BBOX[1.21,99.59,6.72,104.6]],
    ID["EPSG",3168]]

Quick glance at the geospatial data:

glimpse(msia)
Rows: 144
Columns: 15
$ ADM2_EN    <chr> "Batu Pahat", "Johor Bahru", "Kluang", "Kota Tinggi", "Kula…
$ ADM2_PCODE <chr> "MY0101", "MY0102", "MY0103", "MY0104", "MY0105", "MY0106",…
$ ADM2_REF   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM2ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM2ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ ADM1_EN    <chr> "Johor", "Johor", "Johor", "Johor", "Johor", "Johor", "Joho…
$ ADM1_PCODE <chr> "MY01", "MY01", "MY01", "MY01", "MY01", "MY01", "MY01", "MY…
$ ADM0_EN    <chr> "Malaysia", "Malaysia", "Malaysia", "Malaysia", "Malaysia",…
$ ADM0_PCODE <chr> "MY", "MY", "MY", "MY", "MY", "MY", "MY", "MY", "MY", "MY",…
$ date       <date> 2020-12-02, 2020-12-02, 2020-12-02, 2020-12-02, 2020-12-02…
$ validOn    <date> 2021-02-11, 2021-02-11, 2021-02-11, 2021-02-11, 2021-02-11…
$ validTo    <date> -001-11-30, -001-11-30, -001-11-30, -001-11-30, -001-11-30…
$ Shape_Leng <dbl> 1.8566472, 1.8288893, 2.3609919, 3.1527941, 1.0739527, 1.40…
$ Shape_Area <dbl> 0.16104119, 0.08016539, 0.23358469, 0.27830606, 0.06154171,…
$ geometry   <MULTIPOLYGON [m]> MULTIPOLYGON (((556714.6 19..., MULTIPOLYGON (…

Quick glance at the crime data:

glimpse(crime)
Rows: 19,152
Columns: 6
$ state    <chr> "Malaysia", "Malaysia", "Malaysia", "Malaysia", "Malaysia", "…
$ district <chr> "All", "All", "All", "All", "All", "All", "All", "All", "All"…
$ category <chr> "assault", "assault", "assault", "assault", "assault", "assau…
$ type     <chr> "all", "all", "all", "all", "all", "all", "all", "all", "caus…
$ date     <date> 2016-01-01, 2017-01-01, 2018-01-01, 2019-01-01, 2020-01-01, …
$ crimes   <dbl> 22327, 21366, 16902, 16489, 13279, 11495, 10348, 10453, 5531,…

Looking at the crime object we can see that “All” fields is included for both districts and type of crime. We want to filter these columns to make it easier to work with and join with geospatial data.

crime_filtered <- crime %>% filter(district != "All")

Date conversion

Taking a look at the other aspatial data:

glimpse(income)
Rows: 318
Columns: 5
$ state         <chr> "Johor", "Johor", "Johor", "Johor", "Johor", "Johor", "J…
$ district      <chr> "Batu Pahat", "Batu Pahat", "Johor Bahru", "Johor Bahru"…
$ date          <date> 2019-01-01, 2022-01-01, 2019-01-01, 2022-01-01, 2019-01…
$ income_mean   <dbl> 7392, 7419, 9315, 9869, 5953, 6461, 6982, 7529, 8602, 91…
$ income_median <dbl> 6504, 6347, 7342, 8232, 4933, 5204, 5475, 6227, 7536, 74…
glimpse(inequality)
Rows: 318
Columns: 4
$ state    <chr> "Johor", "Johor", "Johor", "Johor", "Johor", "Johor", "Johor"…
$ district <chr> "Batu Pahat", "Batu Pahat", "Johor Bahru", "Johor Bahru", "Kl…
$ date     <date> 2019-01-01, 2022-01-01, 2019-01-01, 2022-01-01, 2019-01-01, …
$ gini     <dbl> 0.295, 0.338, 0.388, 0.359, 0.333, 0.354, 0.361, 0.343, 0.324…
glimpse(poverty)
Rows: 318
Columns: 5
$ state            <chr> "Johor", "Johor", "Johor", "Johor", "Johor", "Johor",…
$ district         <chr> "Batu Pahat", "Batu Pahat", "Johor Bahru", "Johor Bah…
$ date             <date> 2019-01-01, 2022-01-01, 2019-01-01, 2022-01-01, 2019…
$ poverty_absolute <dbl> 2.9, 5.1, 3.3, 3.7, 5.0, 7.2, 6.0, 5.0, 3.2, 0.4, 12.…
$ poverty_relative <dbl> 9.0, 19.4, 12.8, 10.4, 24.9, 27.4, 20.8, 17.0, 10.1, …

Looking at the aspatial data, we can see that the date is just the first day of each year and that the month and day does not matter. We shall just convert the column to just the year.

Since the data for poverty, income, and income inequality only has data for 2019 and 2022, we keep only 2019 and 2022 crime data to match the other data.

crime_filtered <- crime_filtered %>%
  mutate(year = year(date))%>%
  select(-date) %>%
  filter(year %in% c(2019, 2022))         
income <- income %>%
  mutate(year = year(date))%>%
  select(-date)           
poverty <- poverty %>%
  mutate(year = year(date))%>%
  select(-date)           
inequality <- inequality %>%
  mutate(year = year(date))%>%
  select(-date)           

Check for empty rows

na <- crime_filtered %>%
  summarise(na_district = sum(is.na(district)),
            na_category = sum(is.na(category)),
            na_type = sum(is.na(type)),
            na_date = sum(is.na(date)),
            na_crimes = sum(is.na(crimes))
            )
Warning: There was 1 warning in `summarise()`.
ℹ In argument: `na_date = sum(is.na(date))`.
Caused by warning in `is.na()`:
! is.na() applied to non-(list or vector) of type 'closure'
print(na)
# A tibble: 1 × 5
  na_district na_category na_type na_date na_crimes
        <int>       <int>   <int>   <int>     <int>
1           0           0       0       0         0
na <- inequality %>%
  summarise(na_district = sum(is.na(district)),
            na_date = sum(is.na(date)),
            na_gini = sum(is.na(gini))
            )
Warning: There was 1 warning in `summarise()`.
ℹ In argument: `na_date = sum(is.na(date))`.
Caused by warning in `is.na()`:
! is.na() applied to non-(list or vector) of type 'closure'
print(na)
# A tibble: 1 × 3
  na_district na_date na_gini
        <int>   <int>   <int>
1           0       0       0
na <- poverty %>%
  summarise(na_district = sum(is.na(district)),
            na_date = sum(is.na(date)),
            na_poverty_absolute = sum(is.na(poverty_absolute)),
            na_poverty_relative = sum(is.na(poverty_relative)),
            )
Warning: There was 1 warning in `summarise()`.
ℹ In argument: `na_date = sum(is.na(date))`.
Caused by warning in `is.na()`:
! is.na() applied to non-(list or vector) of type 'closure'
print(na)
# A tibble: 1 × 4
  na_district na_date na_poverty_absolute na_poverty_relative
        <int>   <int>               <int>               <int>
1           0       0                   0                   0
na <- income %>%
  summarise(na_district = sum(is.na(district)),
            na_date = sum(is.na(date)),
            na_poverty_income_mean = sum(is.na(income_mean)),
            na_poverty_income_median = sum(is.na(income_median)),
            )
Warning: There was 1 warning in `summarise()`.
ℹ In argument: `na_date = sum(is.na(date))`.
Caused by warning in `is.na()`:
! is.na() applied to non-(list or vector) of type 'closure'
print(na)
# A tibble: 1 × 4
  na_district na_date na_poverty_income_mean na_poverty_income_median
        <int>   <int>                  <int>                    <int>
1           0       0                      0                        0

Mismatched district names

District names in crime_filtered that are not in district names of msia.

setdiff(unique(crime_filtered$district), unique(msia$ADM2_EN))
 [1] "Iskandar Puteri"        "Johor Bahru Selatan"    "Johor Bahru Utara"     
 [4] "Nusajaya"               "Seri Alam"              "Bandar Bharu"          
 [7] "Nilai"                  "Cameron Highland"       "Kuala Lipis"           
[10] "Batu Gajah"             "Gerik"                  "Ipoh"                  
[13] "Manjung"                "Pengkalan Hulu"         "Selama"                
[16] "Sungai Siput"           "Taiping"                "Tanjong Malim"         
[19] "Tapah"                  "Arau"                   "Kangar"                
[22] "Padang Besar"           "Seberang Perai Selatan" "Seberang Perai Tengah" 
[25] "Seberang Perai Utara"   "Kota Kinabatangan"      "Kota Samarahan"        
[28] "Matu Daro"              "Padawan"                "Ampang Jaya"           
[31] "Hulu Selangor"          "Kajang"                 "Klang Selatan"         
[34] "Klang Utara"            "Petaling Jaya"          "Serdang"               
[37] "Sg. Buloh"              "Shah Alam"              "Subang Jaya"           
[40] "Sungai Buloh"           "Brickfields"            "Cheras"                
[43] "Dang Wangi"             "Sentul"                 "Wangsa Maju"           

District names in income that are not in district names of msia.

setdiff(unique(income$district), unique(msia$ADM2_EN))
 [1] "Kulai"                  "Tangkak"                "Kecil Lojing"          
 [4] "Bagan Datuk"            "Hulu Perak"             "Larut dan Matang"      
 [7] "Manjung"                "Muallim"                "Selama"                
[10] "Seberang Perai Selatan" "Seberang Perai Tengah"  "Seberang Perai Utara"  
[13] "Kalabakan"              "Telupid"                "Beluru"                
[16] "Bukit Mabong"           "Kabong"                 "Maradong"              
[19] "Pusa"                   "Sebauh"                 "Subis"                 
[22] "Tanjung Manis"          "Tebedu"                 "Telang Usan"           
[25] "Kuala Nerus"            "W.P. Kuala Lumpur"     

District names in inequality that are not in district names of msia.

setdiff(unique(inequality$district), unique(msia$ADM2_EN))
 [1] "Kulai"                  "Tangkak"                "Kecil Lojing"          
 [4] "Bagan Datuk"            "Hulu Perak"             "Larut dan Matang"      
 [7] "Manjung"                "Muallim"                "Selama"                
[10] "Seberang Perai Selatan" "Seberang Perai Tengah"  "Seberang Perai Utara"  
[13] "Kalabakan"              "Telupid"                "Beluru"                
[16] "Bukit Mabong"           "Kabong"                 "Maradong"              
[19] "Pusa"                   "Sebauh"                 "Subis"                 
[22] "Tanjung Manis"          "Tebedu"                 "Telang Usan"           
[25] "Kuala Nerus"            "W.P. Kuala Lumpur"     

District names in poverty that are not in district names of msia.

setdiff(unique(poverty$district), unique(msia$ADM2_EN))
 [1] "Kulai"                  "Tangkak"                "Kecil Lojing"          
 [4] "Bagan Datuk"            "Hulu Perak"             "Larut dan Matang"      
 [7] "Manjung"                "Muallim"                "Selama"                
[10] "Seberang Perai Selatan" "Seberang Perai Tengah"  "Seberang Perai Utara"  
[13] "Kalabakan"              "Telupid"                "Beluru"                
[16] "Bukit Mabong"           "Kabong"                 "Maradong"              
[19] "Pusa"                   "Sebauh"                 "Subis"                 
[22] "Tanjung Manis"          "Tebedu"                 "Telang Usan"           
[25] "Kuala Nerus"            "W.P. Kuala Lumpur"     

There is no easy way to fix this but to google the districts mentioned in crime and try to map it as close as close to the district in the sf file.

We shall create a function for renaming the districts in the aspatial data to match that of the geospatial data:

rename_districts <- function(data) {
  data <- data %>%
    mutate(district = case_when(
      district %in% c("Iskandar Puteri", "Nusajaya", "Johor Bahru Selatan", "Johor Bahru Utara", "Seri Alam") ~ "Johor Bahru",
      district == "Bandar Bharu" ~ "Bandar Baharu",
      district %in% c("Brickfields", "Cheras", "Dang Wangi", "Sentul", "Wangsa Maju","W.P. Kuala Lumpur") ~ "WP. Kuala Lumpur",
      district == "Nilai" ~ "Seremban",
      district == "Cameron Highland" ~ "Cameron Highlands",
      district == "Kuala Lipis" ~ "Lipis",
      district %in% c("Batu Gajah", "Ipoh") ~ "Kinta",
      district == "Gerik" ~ "Ulu Perak",
      district == "Manjung" ~ "Manjung (Dinding)",
      district == "Pangkalan Hulu" ~ "Ulu Perak",
      district %in% c("Selama", "Taiping", "Larut dan Matang") ~ "Larut Dan Matang",
      district == "Sungai Siput" ~ "Kuala Kangsar",
      district %in% c("Tanjong Malim", "Tapah", "Bagan Datuk", "Muallim") ~ "Batang Padang",
      district %in% c("Arau", "Kangar", "Padang Besar") ~ "Perlis",
      state == "Pulau Pinang" & district == "Seberang Perai Selatan" ~ "S.P.Selatan",
      district == "Seberang Perai Tengah" ~ "S.P. Tengah",
      district == "Seberang Perai Utara" ~ "S.P. Utara",
      district == "Ampang Jaya" ~ "Gombak",
      district == "Kajang" ~ "Ulu Langat",
      district %in% c("Pengkalan Hulu","Hulu Perak") ~ "Ulu Perak",
      district == "Hulu Selangor" ~ "Ulu Selangor",
      district %in% c("Klang Selatan", "Klang Utara") ~ "Klang",
      district %in% c("Petaling Jaya", "Serdang", "Sg. Buloh", "Shah Alam", "Subang Jaya", "Sungai Buloh") ~ "Petaling",
      district == "Kota Kinabatangan" ~ "Kinabatangan",
      district == "Kota Samarahan" ~ "Samarahan",
      district %in% c("Matu Daro", "Tanjung Manis") ~ "Mukah",
      district == "Padawan" ~ "Kuching",
      district == "Kulai" ~ "Kulaijaya",
      district == "Tangkak" ~ "Ledang",
      district == "Kecil Lojing" ~ "Gua Musang",
      district == "Kalabakan" ~ "Tawau",
      district == "Telupid" ~ "Beluran",
      district == "Beluru" ~ "Miri",
      district == "Bukit Mabong" ~ "Kapit",
      district == "Kabong" ~ "Saratok",
      district == "Maradong" ~ "Meradong",
      district == "Pusa" ~ "Betong",
      district == "Sebauh" ~ "Bintulu",
      district == "Subis" ~ "Miri",
      district == "Tebedu" ~ "Serian",
      district == "Telang Usan" ~ "Marudi",
      district == "Kuala Nerus" ~ "Kuala Terengganu",
      TRUE ~ district
    ))
  
  return(data)
}

Apply the renaming function to crime data:

crime_filtered <- rename_districts(crime_filtered)

Remove and combine duplicates data that are considered to be in the same district after renaming:

crime_filtered <- crime_filtered %>%
  group_by(state, district, category, type, year) %>% 
  summarize(crimes = sum(crimes), .groups = 'drop')

Do the same for the other aspatial data:

income <- rename_districts(income)
income <- income %>%
  group_by(state, district, year) %>% 
  summarize(
    income_mean = sum(income_mean),
    income_median = sum(income_median),
    .groups = 'drop'
  )
inequality <- rename_districts(inequality)
inequality <- inequality %>%
  group_by(state, district, year) %>% 
  summarize(
    inequality = sum(gini),
    .groups = 'drop'
  )
poverty <- rename_districts(poverty)
poverty <- poverty %>%
  group_by(state, district, year) %>% 
  summarize(
    poverty_relative = sum(poverty_relative),
    poverty_absolute = sum(poverty_absolute),
    .groups = 'drop'
  )

Check if there is still any mismatched district names:

setdiff(unique(crime_filtered$district), unique(msia$ADM2_EN))
character(0)
setdiff(unique(income$district), unique(msia$ADM2_EN))
character(0)
setdiff(unique(inequality$district), unique(msia$ADM2_EN))
character(0)
setdiff(unique(poverty$district), unique(msia$ADM2_EN))
character(0)

Joining of data

Joining the aspatial data together

combined_data <- crime_filtered %>%
  left_join(poverty, by = c("state", "district", "year")) %>%
  left_join(inequality, by = c("state", "district", "year")) %>%
  left_join(income, by = c("state", "district", "year"))

Filtering out redundant data and keep only relevant data

msia_geometry <- msia %>%
  select(1, 13:15)

Joining with geospatial data

combined_data <- combined_data %>%
  left_join(msia_geometry, by = c("district" = "ADM2_EN"))

GWR

Type of crime selection

For our Shiny app, we would like user to be able to freely choose the type of crime for their own viewing. This could be the individual types or all of the types.

To get types of crime to only be murder:

combined_data_murder <- combined_data %>%
  filter(type == "murder")

To get types of crime to only be causing_injury:

combined_data_injury <- combined_data %>%
  filter(type == "causing_injury")

To get types of crime to be all:

combined_data_all <- combined_data %>%
  filter(type == "all")

To get types of crime to be causing_injury and murderin 2022 only:

combined_data_filtered <- combined_data %>%
  filter(type %in% c("causing_injury", "murder") & year %in% c(2022))

While using filter(type == "causing_injury") is concise but filter(type %in% c("causing_injury")) allows the flexibility for one or more independent variables. This is beneficial for us when we want to allow users to have the freedom to select their own variable(s).

Correlation matrix

We would also like for the user to be able to view the correlation matrix so that they can freely choose which independent variables to be used on GWR conducted in our application.

For murder cases:

ggcorrmat(combined_data_murder[, 6:11])

For causing injury cases:

ggcorrmat(combined_data_injury[, 6:11])

An interesting parameter that can be included in ggcorrmat() is that we can choose the significance level of the matrix. This allows user to have more control over viewing the correlation.

ggcorrmat(combined_data_all[, 6:11], sig.level = 0.05)

Note: Income median and mean are highly correlated since they are similar and came from the same data set. Either of the variable might be removed for GWR in the Shiny App.

The correlation analysis will be under a tabset where user can toggle from the main GWR. On the side panel, selectInput() will be used to select independent variables that the user wants to view and the type of crimes, another checkboxGroupInput() to select the year(s) and sliderInput() will be used for them to select the significance level of the correlation matrix.,

Linear regression model

To allow user is able to freely select which independent variables and type of crimes we need to create a function that dynamically accepts inputs.

run_regression <- function(data, response, predictors) {
  # Create formula from response and predictors
  formula <- as.formula(
    paste(response, "~", paste(predictors, collapse = " + "))
  )
  
  # Run the linear model
  model <- lm(formula = formula, data = data)
  
  return(model)
}

An example of using this function with all the variables as the independent variables and only using murder as the type of crime:

# Define predictors as a vector of variable names
predictors <- c("poverty_relative", "poverty_absolute", "inequality", "income_mean", "income_median")

# Run the function with the specified data and predictors
murder_model <- run_regression(
  data = combined_data_murder,
  response = "crimes",
  predictors = predictors
)

ols_regress(murder_model)
                         Model Summary                           
----------------------------------------------------------------
R                       0.695       RMSE                  2.760 
R-Squared               0.483       MSE                   7.797 
Adj. R-Squared          0.472       Coef. Var           132.240 
Pred R-Squared          0.444       AIC                1279.749 
MAE                     1.805       SBC                1304.674 
----------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                ANOVA                                 
---------------------------------------------------------------------
                Sum of                                               
               Squares         DF    Mean Square      F         Sig. 
---------------------------------------------------------------------
Regression    1847.330          5        369.466    47.386    0.0000 
Residual      1980.435        254          7.797                     
Total         3827.765        259                                    
---------------------------------------------------------------------

                                      Parameter Estimates                                        
------------------------------------------------------------------------------------------------
           model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
------------------------------------------------------------------------------------------------
     (Intercept)      0.165         0.582                  0.283    0.777     -0.981      1.310 
poverty_relative      0.029         0.020        0.104     1.509    0.133     -0.009      0.068 
poverty_absolute      0.018         0.024        0.057     0.757    0.450     -0.029      0.065 
      inequality    -22.041         3.030       -0.742    -7.274    0.000    -28.008    -16.073 
     income_mean      0.005         0.000        3.634    10.779    0.000      0.004      0.006 
   income_median     -0.005         0.001       -2.757    -8.686    0.000     -0.006     -0.004 
------------------------------------------------------------------------------------------------

An example of using this function with poverty absolute, inequality and median income as the independent variables and all type of crime:

# Define predictors as a vector of variable names
predictors <- c("poverty_absolute", "inequality", "income_median")

# Run the function with the specified data and predictors
all_model <- run_regression(
  data = combined_data_all,
  response = "crimes",
  predictors = predictors
)

ols_regress(all_model)
                           Model Summary                             
--------------------------------------------------------------------
R                         0.408       RMSE                  595.003 
R-Squared                 0.167       MSE                356772.709 
Adj. R-Squared            0.162       Coef. Var             232.526 
Pred R-Squared            0.142       AIC                  8129.805 
MAE                     271.215       SBC                  8151.074 
--------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                   ANOVA                                    
---------------------------------------------------------------------------
                     Sum of                                                
                    Squares         DF     Mean Square      F         Sig. 
---------------------------------------------------------------------------
Regression     36800562.495          3    12266854.165    34.383    0.0000 
Residual      184094717.629        516      356772.709                     
Total         220895280.123        519                                     
---------------------------------------------------------------------------

                                         Parameter Estimates                                          
-----------------------------------------------------------------------------------------------------
           model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
-----------------------------------------------------------------------------------------------------
     (Intercept)       31.321        87.811                  0.357    0.721     -141.190     203.832 
poverty_absolute       -0.599         3.467       -0.011    -0.173    0.863       -7.410       6.211 
      inequality    -1289.314       375.395       -0.256    -3.435    0.001    -2026.804    -551.823 
   income_median        0.141         0.018        0.473     7.924    0.000        0.106       0.176 
-----------------------------------------------------------------------------------------------------

Stepwise regression model selection

While allowing user the freedom of controlling the model used, we should provide information for them to choose model that suit their needs.

Function for selecting the type of stepwise regression:

run_stepwise_selection <- function(model, direction = "forward", p_val = 0.05, details = FALSE) {
  if (!direction %in% c("forward", "backward", "both")) {
    stop("Invalid direction. Choose from 'forward', 'backward', or 'both'.")
  }
  
  stepwise_model <- switch(
    direction,
    "forward" = ols_step_forward_p(model, p_val = p_val, details = details),
    "backward" = ols_step_backward_p(model, p_val = p_val, details = details),
    "both" = ols_step_both_p(model, p_val = p_val, details = details)
  )
  
  return(stepwise_model)
}

Forward selection

murder_fw_mlr <- run_stepwise_selection(
  model = murder_model,
  direction = "forward",
  p_val = 0.05,
  details = FALSE
)

print(murder_fw_mlr)

                                Stepwise Summary                                
------------------------------------------------------------------------------
Step    Variable           AIC         SBC        SBIC        R2       Adj. R2 
------------------------------------------------------------------------------
 0      Base Model       1441.080    1448.202    701.849    0.00000    0.00000 
 1      income_mean      1374.046    1384.728    634.769    0.23319    0.23022 
 2      inequality       1343.382    1357.625    604.139    0.32372    0.31846 
 3      income_median    1279.575    1297.379    541.796    0.47494    0.46879 
------------------------------------------------------------------------------

Final Model Output 
------------------

                         Model Summary                           
----------------------------------------------------------------
R                       0.689       RMSE                  2.780 
R-Squared               0.475       MSE                   7.851 
Adj. R-Squared          0.469       Coef. Var           132.696 
Pred R-Squared          0.445       AIC                1279.575 
MAE                     1.853       SBC                1297.379 
----------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                ANOVA                                 
---------------------------------------------------------------------
                Sum of                                               
               Squares         DF    Mean Square      F         Sig. 
---------------------------------------------------------------------
Regression    1817.974          3        605.991    77.189    0.0000 
Residual      2009.792        256          7.851                     
Total         3827.765        259                                    
---------------------------------------------------------------------

                                     Parameter Estimates                                      
---------------------------------------------------------------------------------------------
        model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
---------------------------------------------------------------------------------------------
  (Intercept)      0.004         0.575                  0.007    0.995     -1.129      1.137 
  income_mean      0.005         0.000        3.505    10.577    0.000      0.004      0.006 
   inequality    -17.547         1.825       -0.591    -9.616    0.000    -21.141    -13.954 
income_median     -0.005         0.001       -2.710    -8.587    0.000     -0.006     -0.004 
---------------------------------------------------------------------------------------------

Backward Elimination

murder_bw_mlr <- run_stepwise_selection(
  model = murder_model,
  direction = "backward",
  p_val = 0.05,
  details = FALSE
)

print(murder_bw_mlr)

                                 Stepwise Summary                                  
---------------------------------------------------------------------------------
Step    Variable              AIC         SBC        SBIC        R2       Adj. R2 
---------------------------------------------------------------------------------
 0      Full Model          1279.749    1304.674    542.184    0.48261    0.47243 
 1      poverty_absolute    1278.335    1299.699    540.699    0.48145    0.47331 
 2      poverty_relative    1279.575    1297.379    541.796    0.47494    0.46879 
---------------------------------------------------------------------------------

Final Model Output 
------------------

                         Model Summary                           
----------------------------------------------------------------
R                       0.689       RMSE                  2.780 
R-Squared               0.475       MSE                   7.851 
Adj. R-Squared          0.469       Coef. Var           132.696 
Pred R-Squared          0.445       AIC                1279.575 
MAE                     1.853       SBC                1297.379 
----------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                ANOVA                                 
---------------------------------------------------------------------
                Sum of                                               
               Squares         DF    Mean Square      F         Sig. 
---------------------------------------------------------------------
Regression    1817.974          3        605.991    77.189    0.0000 
Residual      2009.792        256          7.851                     
Total         3827.765        259                                    
---------------------------------------------------------------------

                                     Parameter Estimates                                      
---------------------------------------------------------------------------------------------
        model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
---------------------------------------------------------------------------------------------
  (Intercept)      0.004         0.575                  0.007    0.995     -1.129      1.137 
   inequality    -17.547         1.825       -0.591    -9.616    0.000    -21.141    -13.954 
  income_mean      0.005         0.000        3.505    10.577    0.000      0.004      0.006 
income_median     -0.005         0.001       -2.710    -8.587    0.000     -0.006     -0.004 
---------------------------------------------------------------------------------------------

Bidirectional Elimination

murder_sb_mlr <- run_stepwise_selection(
  model = murder_model,
  direction = "both",
  p_val = 0.05,
  details = FALSE
)

print(murder_sb_mlr)

                                   Stepwise Summary                                    
-------------------------------------------------------------------------------------
Step    Variable                  AIC         SBC        SBIC        R2       Adj. R2 
-------------------------------------------------------------------------------------
 0      Base Model              1441.080    1448.202    701.849    0.00000    0.00000 
 1      income_mean (+)         1374.046    1384.728    634.769    0.23319    0.23022 
 2      inequality (+)          1343.382    1357.625    604.139    0.32372    0.31846 
 3      income_median (+)       1279.575    1297.379    541.796    0.47494    0.46879 
 4      poverty_relative (+)    1278.335    1299.699    540.699    0.48145    0.47331 
-------------------------------------------------------------------------------------

Final Model Output 
------------------

                         Model Summary                           
----------------------------------------------------------------
R                       0.694       RMSE                  2.763 
R-Squared               0.481       MSE                   7.784 
Adj. R-Squared          0.473       Coef. Var           132.130 
Pred R-Squared          0.448       AIC                1278.335 
MAE                     1.813       SBC                1299.699 
----------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                ANOVA                                 
---------------------------------------------------------------------
                Sum of                                               
               Squares         DF    Mean Square      F         Sig. 
---------------------------------------------------------------------
Regression    1842.865          4        460.716    59.188    0.0000 
Residual      1984.900        255          7.784                     
Total         3827.765        259                                    
---------------------------------------------------------------------

                                      Parameter Estimates                                        
------------------------------------------------------------------------------------------------
           model       Beta    Std. Error    Std. Beta      t        Sig       lower      upper 
------------------------------------------------------------------------------------------------
     (Intercept)      0.102         0.575                  0.178    0.859     -1.031      1.236 
     income_mean      0.005         0.000        3.623    10.767    0.000      0.004      0.006 
      inequality    -20.859         2.594       -0.702    -8.040    0.000    -25.968    -15.750 
   income_median     -0.005         0.001       -2.774    -8.770    0.000     -0.006     -0.004 
poverty_relative      0.033         0.019        0.119     1.788    0.075     -0.003      0.070 
------------------------------------------------------------------------------------------------

Visualising model metrics

metric <- compare_performance(murder_model, 
                    murder_fw_mlr$model,
                    murder_bw_mlr$model,
                    murder_sb_mlr$model)
metric$Name <- gsub(".*\\\\([a-zA-Z0-9_]+)\\\\, \\\\model\\\\.*", "\\1", metric$Name)
plot(metric)

While the plot looks fine, the naming of the models are confusing and special care should be taken to ensure that they are indicative to the reader what model this is.

The correlation analysis will be under a tabset where user can toggle from the main GWR. On the side panel, selectInput() will be used to select the type of crimes and the independent variables that the user wants to view, another checkboxGroupInput() to select the year(s) and sliderInput() will be used for them to select the p value of the regression models.

Checks

Linearity test

out <- plot(check_model(murder_sb_mlr$model, 
                        panel = FALSE))
For confidence bands, please install `qqplotr`.
out[[2]]

Normality test

plot(check_normality(murder_sb_mlr$model))
For confidence bands, please install `qqplotr`.

A histogram might be a better approach for user to tell the distribution at a glance instead.

ols_plot_resid_hist(murder_sb_mlr$model)

Outliers

plot(check_outliers(murder_sb_mlr$model,
                           method = "cook"))

Multicollinearity

plot(check_collinearity(murder_sb_mlr$model)) + 
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Variable `Component` is not in your data frame :/

The checks will be made up of the multicollinearity, normality, outliers, and collinearity graphs. It will be under a tabset where user can toggle from the main GWR. On the side panel, selectInput() will be used to select independent variables that the user wants to view and the type of crimes, another checkboxGroupInput() to select the year(s).

Testing for Spatial Autocorrelation

mlr_output <- as.data.frame(murder_sb_mlr$model$residuals) %>%
  rename(`SB_MLR_RES` = `murder_sb_mlr$model$residuals`)

Due to empty rows we need to pad “NA”

murder_residual <- data.frame(MLR_RES = rep(NA, nrow(combined_data_murder)))

murder_residual[rownames(mlr_output), "MLR_RES"] <- mlr_output$SB_MLR_RES
combined_data_murder <- cbind(combined_data_murder, 
                        murder_residual)
combined_data_murder_st <- st_as_sf(combined_data_murder)

Visualising:

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(combined_data_murder_st) +
  tm_polygons(col = "MLR_RES", alpha = 0.6)+
  tm_view(set.zoom.limits = c(5, 9))
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting

Further tests for spatial autocorrelation will be done by my group mate.

GWmodel

Compute bandwith

Filter out rows with NA or Inf in the relevant columns:

combined_data_murder_st_clean <- combined_data_murder_st %>%
  filter(
    !is.na(crimes) & !is.na(poverty_relative) & !is.na(poverty_absolute) &
    !is.na(inequality) & !is.na(income_mean) & !is.na(income_median) &
    is.finite(crimes) & is.finite(poverty_relative) & is.finite(poverty_absolute) &
    is.finite(inequality) & is.finite(income_mean) & is.finite(income_median)
  )

Users will be able to choose for:

  • approach - CV or AIC corrected

  • kernel- gaussian, exponential, bisquare, tricube

  • adaptive - true or false (adaptive or fixed bandwidth)

bw.fixed <- bw.gwr(formula = crimes ~ poverty_relative + poverty_absolute + inequality +
                     income_mean + income_median, 
                   data=combined_data_murder_st_clean, 
                   approach="CV", 
                   kernel="gaussian", 
                   adaptive=FALSE, 
                   longlat=FALSE)
Fixed bandwidth: 1294962 CV score: 2081.87 
Fixed bandwidth: 800490.4 CV score: 2045.571 
Fixed bandwidth: 494890.3 CV score: 2037.256 
Fixed bandwidth: 306019.1 CV score: 2028.107 
Fixed bandwidth: 189290.2 CV score: 2001.025 
Fixed bandwidth: 117147.8 CV score: 1921.412 
Fixed bandwidth: 72561.4 CV score: 1865.845 
Fixed bandwidth: 45005.46 CV score: 2258.148 
Fixed bandwidth: 89591.91 CV score: 1860.898 
Fixed bandwidth: 100117.3 CV score: 1875.523 
Fixed bandwidth: 83086.83 CV score: 1859.626 
Fixed bandwidth: 79066.47 CV score: 1860.707 
Fixed bandwidth: 85571.55 CV score: 1859.599 
Fixed bandwidth: 87107.19 CV score: 1859.875 
Fixed bandwidth: 84622.47 CV score: 1859.545 
Fixed bandwidth: 84035.91 CV score: 1859.552 
Fixed bandwidth: 84984.99 CV score: 1859.556 
Fixed bandwidth: 84398.42 CV score: 1859.544 
Fixed bandwidth: 84259.96 CV score: 1859.546 
Fixed bandwidth: 84484 CV score: 1859.544 
Fixed bandwidth: 84536.89 CV score: 1859.544 
Fixed bandwidth: 84451.31 CV score: 1859.544 
gwr.fixed <- gwr.basic(formula = crimes ~ poverty_relative + poverty_absolute + inequality +
                     income_mean + income_median, 
                       data=combined_data_murder_st_clean, 
                       bw=bw.fixed, 
                       kernel = 'gaussian', 
                       longlat = FALSE)
gwr.fixed
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-03 21:45:15.126012 
   Call:
   gwr.basic(formula = crimes ~ poverty_relative + poverty_absolute + 
    inequality + income_mean + income_median, data = combined_data_murder_st_clean, 
    bw = bw.fixed, kernel = "gaussian", longlat = FALSE)

   Dependent (y) variable:  crimes
   Independent variables:  poverty_relative poverty_absolute inequality income_mean income_median
   Number of data points: 260
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-5.8826 -1.4242 -0.3438  0.9802 13.9557 

   Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
   (Intercept)       1.647e-01  5.817e-01   0.283    0.777    
   poverty_relative  2.942e-02  1.950e-02   1.509    0.133    
   poverty_absolute  1.813e-02  2.395e-02   0.757    0.450    
   inequality       -2.204e+01  3.030e+00  -7.274 4.33e-12 ***
   income_mean       5.324e-03  4.939e-04  10.779  < 2e-16 ***
   income_median    -4.848e-03  5.581e-04  -8.686 4.65e-16 ***

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 2.792 on 254 degrees of freedom
   Multiple R-squared: 0.4826
   Adjusted R-squared: 0.4724 
   F-statistic: 47.39 on 5 and 254 DF,  p-value: < 2.2e-16 
   ***Extra Diagnostic information
   Residual sum of squares: 1980.435
   Sigma(hat): 2.770578
   AIC:  1279.749
   AICc:  1280.194
   BIC:  1083.599
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Fixed bandwidth: 84484 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                           Min.     1st Qu.      Median     3rd Qu.    Max.
   Intercept        -47.2950014  -0.7340682  -0.0773994   0.7073472  2.6754
   poverty_relative  -0.2973646  -0.0618064  -0.0199016   0.0046969  0.2697
   poverty_absolute  -0.1670997  -0.0053479   0.0356065   0.0853528  1.0949
   inequality       -32.9982183 -22.0400396 -10.5543811  -2.3699028 88.4842
   income_mean       -0.0018564   0.0022187   0.0034693   0.0064328  0.0077
   income_median     -0.0074840  -0.0060976  -0.0030624  -0.0016581  0.0021
   ************************Diagnostic information*************************
   Number of data points: 260 
   Effective number of parameters (2trace(S) - trace(S'S)): 70.86396 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 189.136 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1284.53 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1192.465 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1190.494 
   Residual sum of squares: 1201.836 
   R-square value:  0.6860216 
   Adjusted R-square value:  0.5677575 

   ***********************************************************************
   Program stops at: 2024-11-03 21:45:15.265036 
combined_data_murder_st_clean_adaptive <- st_as_sf(gwr.fixed$SDF) %>%
  st_transform(crs=3168)

Visualising

This map shows the local R2

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(combined_data_murder_st_clean_adaptive) +
  tm_polygons(col = "Local_R2", alpha = 0.6) +
  tm_view(set.zoom.limits = c(5, 9))
tmap_mode("plot")
tmap mode set to plotting

This would show the local R2 on the left and coefficent estimates on the right to allow comparison on how well the local regression model explains the coefficent estimates.

tmap_mode("view")
tmap mode set to interactive viewing
Local_R2 <- tm_shape(combined_data_murder_st_clean_adaptive) +
  tm_polygons(col = "Local_R2", alpha = 0.6) +
  tm_view(set.zoom.limits = c(5, 9))

Inequality_TV <- tm_shape(combined_data_murder_st_clean_adaptive) +
  tm_polygons(col = "inequality_TV", alpha = 0.6) +
  tm_view(set.zoom.limits = c(5, 9))

tmap_arrange(Local_R2, Inequality_TV, asp = 1, ncol = 2, sync = TRUE)
Variable(s) "inequality_TV" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting

This will be the main tabset. On the side panel, selectInput() will be used to select independent variables that the user wants to view and the type of crimes, another checkboxGroupInput() to select the year(s). The three selectInput() at the bottom of the side panel will be used to control the bandwidth computation parameters.

Proposed Shiny Application

Model tabset

Correlation tabset

Model Selection

Checks tabset